Step 2: Defining the System
With clearly defined objectives and a well-organized plan for the study, the system to be simulated can now be defined in detail. This can be viewed as developing a conceptual model on which the simulation model will be based. Gathering and validating system information can be overwhelming when there are stacks of uncorrelated data to sort through. Data is seldom available in a form that defines exactly how the system works, and many data gathering efforts end up with lots of data but very little useful information.
Data gathering should never be performed without a purpose. Rather than being haphazard, it should be goal-oriented, with a focus on information that will achieve the objectives of the study. There are several guidelines to keep in mind when gathering data.
	Identify cause-and-effect relationships   It is important to correctly identify the causes or conditions under which activities are performed. In gathering downtime data, for example, it is helpful to distinguish between unplanned downtimes due to equipment failure or personal emergencies and planned downtimes for breaks. Once the causes have been established and analyzed, activities can be properly categorized.
	Look for key impact factors   Be selective when gathering data to avoid wasting time examining factors that have little or no impact on system performance. If, for example, an operator is dedicated to a particular task and, therefore, is never a cause of delays in service, there is no need to include the operator in the model. Likewise, extremely rare downtimes, negligible move times, and other insignificant or irrelevant activities that have no appreciable effect on routine system performance may be safely ignored.
	Distinguish between time-dependent and condition-dependent activities   Time-dependent activities are those that take a predictable amount of time to complete, such as customer service. Condition-dependent activities can only be completed when certain defined conditions within the system are satisfied. Because their completion depends on those conditions rather than on elapsed time, condition-dependent activities are uncontrollable and therefore unpredictable. An example of a condition-dependent activity might be the approval of a loan application contingent upon a favorable credit check.
Many activities are partially time-dependent and partially condition-dependent. When gathering data on these activities, it is important to distinguish between the time actually required to perform the activity and the time spent waiting for resources to become available or other conditions to be met before the activity can be performed. If, for example, historical data is used to determine repair times, the time spent doing the actual repair work should be used without including the time spent waiting for a repair person to become available. 
	Focus on essence rather than substance   A system definition for modeling purposes should capture the key cause-and-effect relationships and ignore incidental details. Using this "black box" approach to system definition, we are not concerned about the nature of the activity being performed, but only the impact that the activity has on the use of resources and the delay of entity flow. For example, the actual operation performed on a machine is not important, but only how long the operation takes and what resources, if any, are tied up during the operation. It is important for the modeler to be constantly thinking abstractly about the system operation in order to avoid getting too caught up in the incidental details.
	Separate input variables from response variables   Input variables in a model define how the system works (e.g., activity times, routing sequences, etc.). Response variables describe how the system responds to a given set of input variables (e.g., work-in-process, idle times, resource utilization, etc.). Input variables should be the focus of data gathering since they are used to define the model. Response variables, on the other hand, are the output of a simulation. Consequently, response variables should only be gathered later to help validate the model once it is built and run.
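The split between input and response variables can be illustrated with a minimal sketch of a single-server queue in Python. Everything here is an assumption made for illustration: the function name, the parameter names, the exponential distributions, and the numeric values are hypothetical and are not taken from any particular simulation package. The point is only that the inputs are supplied to the model, while the responses are computed by it.

```python
import random


def simulate_single_server(interarrival_mean, service_mean, n_entities, seed=0):
    """Toy single-server FIFO queue (illustrative only).

    Input variables: mean interarrival time, mean service time, entity count.
    Response variables: average wait in queue and server utilization.
    Exponential times are an assumption, not a recommendation.
    """
    rng = random.Random(seed)
    arrival = 0.0          # arrival time of the current entity
    server_free_at = 0.0   # time at which the server finishes its current job
    total_wait = 0.0
    busy_time = 0.0

    for _ in range(n_entities):
        arrival += rng.expovariate(1.0 / interarrival_mean)
        start = max(arrival, server_free_at)   # wait if the server is busy
        service = rng.expovariate(1.0 / service_mean)
        total_wait += start - arrival
        busy_time += service
        server_free_at = start + service

    return {
        "avg_wait": total_wait / n_entities,
        "utilization": busy_time / server_free_at,
    }


# Inputs are gathered from the real system (values here are made up).
inputs = {"interarrival_mean": 5.0, "service_mean": 4.0, "n_entities": 10_000}

# Responses come out of the model and would be compared against
# observed system performance only when validating the model.
responses = simulate_single_server(**inputs)
print(responses)
```

In this sketch, measured utilization or waiting time from the real system would never be fed into the model; it would only be compared against `responses` after the run.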
These guidelines should help ensure that the system is thought of in the proper light for simulation purposes.
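As a small illustration of the guideline on partially time-dependent activities, the sketch below separates the time spent waiting for a repair person from the time spent on the repair itself. The record layout, field names, and sample values are hypothetical; real historical data may need additional cleaning before it can be split this way.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RepairRecord:
    """One historical downtime event, in minutes (hypothetical layout)."""
    failed_at: float      # machine goes down
    repair_start: float   # repair person begins work
    repair_end: float     # machine is back up


def split_downtime(records):
    """Return (mean wait for repair person, mean actual repair time)."""
    waits = [r.repair_start - r.failed_at for r in records]
    repairs = [r.repair_end - r.repair_start for r in records]
    return mean(waits), mean(repairs)


# Made-up sample data for illustration only.
records = [
    RepairRecord(failed_at=0.0, repair_start=10.0, repair_end=40.0),
    RepairRecord(failed_at=100.0, repair_start=100.0, repair_end=125.0),
    RepairRecord(failed_at=200.0, repair_start=230.0, repair_end=265.0),
]

avg_wait, avg_repair = split_downtime(records)
print(f"mean wait for repair person: {avg_wait:.1f} min")
print(f"mean actual repair time:     {avg_repair:.1f} min")
```

Only the repair time would be entered as the activity time in the model. The wait for a repair person is a condition-dependent delay that the model should produce on its own from resource availability.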
To help organize the process of gathering data for defining the system, the following steps are recommended:
 Determine data requirements.
 Use appropriate data sources.
 Make assumptions where necessary.
 Convert data into a useful form.
 Document and approve the data.
Each of these steps is explained on the following pages.